Performance Improvement of Speaker Recognition Using Enhanced Feature Extraction in Glottal Flow Signals and Multiple Feature Parameter Combination

Jihoon Kang (강지훈); Youngil Kim (김영일); Sangbae Jeong (정상배)

Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회 논문지)

Korean Title	Glottal flow 신호에서의 향상된 특징추출 및 다중 특징파라미터 결합을 통한 화자인식 성능 향상
English Title	Performance Improvement of Speaker Recognition Using Enhanced Feature Extraction in Glottal Flow Signals and Multiple Feature Parameter Combination
Author	Jihoon Kang (강지훈), Youngil Kim (김영일), Sangbae Jeong (정상배)
Citation	Vol. 19, No. 12, pp. 2792-2799 (December 2015)
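Korean Abstract (translated)	To improve speaker recognition performance, this paper extracts source mel-frequency cepstral coefficients (SMFCCs), skewness, and kurtosis from the glottal flow. Because the magnitude response of the glottal flow is generally flat in the high-frequency band, SMFCCs are extracted only below a predefined cutoff frequency. The extracted SMFCCs, skewness, and kurtosis are combined with conventional feature parameters, and the dimensionality is then reduced by principal component analysis (PCA) and linear discriminant analysis (LDA) so that performance can be compared with a conventional speaker recognition system under equivalent conditions. Large-scale speaker recognition experiments confirmed that the proposed system outperforms the conventional system, with a larger improvement when the number of Gaussian mixtures is small.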
English Abstract	In this paper, we use source mel-frequency cepstral coefficients (SMFCCs), skewness, and kurtosis extracted from glottal flow signals to improve speaker recognition performance. Because the high-band magnitude response of glottal flow signals is generally flat, the SMFCCs are extracted using only the response below a predefined cutoff frequency. The extracted SMFCCs, skewness, and kurtosis are concatenated with conventional feature parameters. Dimensionality reduction by principal component analysis (PCA) and linear discriminant analysis (LDA) is then applied so that performance can be compared with conventional systems under equivalent conditions. The proposed recognition system outperformed the conventional system in large-scale speaker recognition experiments. The improvement was most noticeable when the number of Gaussian mixtures was small.
Keyword	speaker recognition, glottal flow, skewness, kurtosis, PCA, LDA
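
The abstract names skewness and kurtosis of the glottal flow as features that are concatenated with conventional feature parameters. The following is a minimal sketch of that one step, not the authors' implementation. It assumes the glottal flow frame has already been estimated elsewhere, uses population (biased) moments, and reports non-excess kurtosis (3 for a Gaussian); the record does not say which definitions the paper uses. The function names `skewness_kurtosis` and `append_shape_features` are hypothetical.

```python
def skewness_kurtosis(frame):
    """Return (skewness, kurtosis) of one frame of samples.

    Assumption: population moments and non-excess kurtosis are used;
    the paper's exact definitions are not stated in this record.
    """
    n = len(frame)
    if n == 0:
        raise ValueError("frame must contain at least one sample")
    mean = sum(frame) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in frame) / n
    if m2 == 0.0:
        # Constant frame: both ratios are undefined, so return zeros
        # as a placeholder convention (an assumption of this sketch).
        return 0.0, 0.0
    m3 = sum((x - mean) ** 3 for x in frame) / n
    m4 = sum((x - mean) ** 4 for x in frame) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def append_shape_features(base_features, glottal_frame):
    """Concatenate a conventional feature vector with skewness and kurtosis.

    base_features stands in for any per-frame feature vector
    (hypothetical input); glottal_frame is the estimated glottal flow.
    """
    skew, kurt = skewness_kurtosis(glottal_frame)
    return list(base_features) + [skew, kurt]
```

In a full system the combined vector would then pass through dimensionality reduction (PCA and LDA in the abstract) before modeling; that stage is omitted here.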